ABSTRACT

We present a galaxy group-finding algorithm, the Photo-z Probability Peaks (P3) algorithm, optimized for locating small galaxy groups using photometric redshift data by searching for peaks in the signal-to-noise of the local overdensity of galaxies in a three-dimensional grid. This method improves on similar two-dimensional matched-filter methods by using redshift information to reduce background contamination, allowing it to accurately detect groups at lower richness. We present the results of tests of our algorithm on galaxy catalogues from the Millennium Simulation. Using a minimum S/N of 3 for detected groups, a group aperture size of 0.25 $h^{-1}$ Mpc, and assuming photometric redshift accuracy of $\sigma_z = 0.05$, it attains a purity of 84% and detects ~295 groups/deg$^2$ with an average group richness of 8.6 members. Assuming photometric redshift accuracy of $\sigma_z = 0.02$, it attains a purity of 97% and detects ~143 groups/deg$^2$ with an average group richness of 12.5 members. We also test our algorithm on data available for the COSMOS field and the presently available fields from the CFHTLS-Wide survey, presenting preliminary results of this analysis.
1 INTRODUCTION

Most galaxies in the Universe are gravitationally bound to one or more other galaxies within galaxy groups. A number of recent studies have indicated that the mass-to-light ratios of groups may be a steep function of the group mass (Marinoni & Hudson 2002; Eke et al. 2004; Parker et al. 2005; Weinmann et al. 2006). This phenomenon may be due to the presence of a critical halo mass above which star formation is efficiently quenched (Dekel & Birnboim 2006; Gilbank & Balogh 2008). Clearly, it is of interest to improve existing data on the mass-to-light ratio on the mass scale close to that of groups, in order to better determine whether, for example, such a critical halo mass exists, and, more generally, to determine what mechanisms may be responsible for the quenching of star formation in the group environment.

The analysis of the mass-to-light ratios of poor groups has been limited by the difficulty both in identifying them and in estimating their masses. The identification of groups is typically based on spectroscopic redshifts, with galaxies assigned to groups through methods such as the Friends-of-Friends algorithm (Huchra & Geller 1982). There are several methods to estimate the masses of groups: X-ray-derived masses, a method that is limited to rich groups (?); virial estimates based on redshifts; and weak gravitational lensing (WL).

Virial mass estimators are problematic for small groups: the virial mass estimator scales as a power of $\sigma$, where $\sigma$ is the velocity dispersion of the group. The accuracy of this method is tied tightly to the number of galaxies in the group; for example, for $N_{\rm mem} = 6$, the estimated $\sigma$ is uncertain to a factor of 2, leading to large uncertainty in the virial mass (Knobel et al. 2009). More importantly, the estimator assumes that the group has reached dynamic equilibrium and that the orbital velocity anisotropy is known; if these assumptions are incorrect, the estimate may be systematically biased.

WL has an advantage over virial estimators because the mass estimates are independent of the current dynamical state of the group. WL mass estimates are particularly valuable for poor groups, for which X-ray-derived masses are unobtainable and virial estimates are most uncertain. However, the signal-to-noise (S/N) for a single group is very low, so it is necessary to "stack" the signal from many groups. Furthermore, the lensing mass is sensitive to all overdensities along the line of sight, so this requires careful calibration with simulations. Previous weak lensing studies of poor systems include ?, Parker et al. (2005), and Sheldon et al. (2009), who studied samples of 59, 116, and 132 473 systems, respectively.

This paper is the first in a series based on data from the CFHTLS-Wide survey (CFHTLS 2009), presenting the method we will be using to identify groups in the CFHTLS-Wide. In future work, we expect to use weak lensing to estimate the masses of groups in the CFHTLS-Wide. To date there has been no spectroscopic survey of the entire 170 square degrees of the CFHTLS-Wide, so here we will use photometric redshifts to assign galaxies to groups. Photometric redshifts have significantly larger random errors than spectroscopic redshifts (> 0.02 versus < 0.001) (Benítez 2000). Due to the large photometric redshift errors, any identified groups will suffer significant contamination from field galaxies. These projection effects will need to be carefully calibrated and corrected when estimating group richness.

Previous methods of detecting groups and clusters using only photometric data have focused on clusters, and/or on red-sequence galaxies. The cluster-red-sequence method (?; Lu et al. 2009), which searches for overdensities of red-sequence galaxies, is optimized for clusters with more than 20 red-sequence galaxies, significantly above the mass scale of interest for this project. The K2 method developed by ? can generate a catalog of ~99% purity, with reasonable completeness for poor clusters. However, the method assumes that galaxies in the same group will have very similar colours. This assumption may result in a significant bias toward groups where the galaxies are all at a similar stage of evolution, particularly among poor groups. The probability-friends-of-friends algorithm (Li & Yee 2008) is better suited to the requirements of this project, producing ~90% purity for groups with 8 or more members, but the purity decreases rapidly below this point, to ~70% for groups with 5 or more members.

With all group/cluster-finding methods there is a trade-off between purity and completeness (Knobel et al. 2009). We expect to have a very large sample of groups, and so can afford to sacrifice completeness for purity in our group selection. Although the probability-friends-of-friends algorithm might be usable with some refinements, our method has been shown to produce significantly higher purity (~80% to ~95% for groups of 5 or more members, depending on the quality of the photometry), albeit at the expense of completeness. With the large quantity of data available in the CFHTLS-Wide survey, this is an acceptable trade-off.

Our method, which we refer to as the Photo-z Probability Peaks (P3) algorithm, involves finding peaks in three dimensions using the photo-z probability distribution functions (PDFs). This method is similar in spirit to the cluster-finding algorithms recently published by ? and Adami et al. (2010). Whereas their methods are optimized for detecting clusters with high completeness, our method is tuned to finding groups and assembling a group catalogue that has high purity.

In Section 2 of this paper we explain our method for identifying galaxy groups. Section 3 gives the results of tests of our method on simulated and observed data sets, for different choices of the algorithm's parameters and for different assumptions regarding the accuracy of the photometric redshifts. Section 4 discusses the applicability of this method to the galaxy catalogues from the CFHTLS-Wide survey, including preliminary results and comparisons with other group catalogues made from these fields. Throughout we adopt a cosmology with the following parameters: $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$, and $H_0 = 70$ km/s/Mpc. All magnitudes are in the AB system unless stated otherwise.



2 GROUP FINDING METHOD 

The methodology behind the P3 algorithm involves searching for significant overdensities in the distribution of galaxies in three dimensions. Specifically, to search for overdensities, we construct a three-dimensional grid of points within the lightcone of the field to be analyzed, and at each point, we calculate the local overdensity of galaxies in a circular aperture surrounding the point, and compare this to the nearby background in an annulus surrounding the point.

In practice, we adopt a 3D grid with a spacing of $R_g \approx 0.2$ comoving $h^{-1}$ Mpc in the transverse direction, with each redshift slice having a thickness of $z_s = 0.02$. A small galaxy group will have a radius of ~0.25 $h^{-1}$ Mpc, so it is resolved with this spacing. The typical photo-z errors are ~0.04, so they are also resolved with this spacing. However, high quality photometric redshifts, such as those provided by Ilbert et al. (2009), may have lower errors and require a finer grid-spacing.
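As an illustration of the grid geometry, the following minimal Python sketch converts the transverse comoving spacing $R_g$ into an angular spacing on the sky at a given redshift, using the flat cosmology adopted in this paper ($\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$). The function names and the use of Simpson's rule are illustrative assumptions and are not taken from the P3 code.

```python
import math

# Hubble distance c / (100 km/s/Mpc), i.e. in units of h^-1 Mpc.
C_OVER_H0 = 2997.92458


def comoving_distance(z, omega_m=0.3, omega_l=0.7, steps=1000):
    """Line-of-sight comoving distance in h^-1 Mpc for a flat cosmology.

    Integrates dz / E(z) with Simpson's rule (steps must be even).
    """
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * inv_e(k * h)
    return C_OVER_H0 * total * h / 3.0


def grid_spacing_arcmin(z, r_g=0.2):
    """Angle (arcmin) subtended at redshift z by a comoving transverse
    length r_g given in h^-1 Mpc (small-angle, flat-space approximation)."""
    return math.degrees(r_g / comoving_distance(z)) * 60.0


if __name__ == "__main__":
    for z in (0.2, 0.5, 0.8):
        print(z, round(grid_spacing_arcmin(z), 3))
```

Because the grid spacing is fixed in comoving length, the angular spacing shrinks with redshift, so each redshift slice needs its own angular grid.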

Our calculation of the galaxy surface density within the aperture, $\rho_{\rm ap}$, proceeds as described below; note that the calculated density is only pseudo-3D, as we use a probability density in the z-dimension. The method is designed to handle regions of the sky which have been masked (e.g. due to bright stars). The procedure is as follows:

• For each galaxy, use the photometric redshift probability density function, here determined via the Bayesian Photometric Redshift Estimation (BPZ) code (Benítez 1999), to obtain the probability ($p_i$) that it is within a given redshift slice of thickness $z_s$.

- In this paper, our algorithm approximates the PDF as a Gaussian distribution to decrease computation time (but can easily be generalized to arbitrary PDFs).

- We multiply the redshift probability calculated above by the BPZ ODDS parameter provided by the photometric redshift method. In BPZ, the ODDS parameter gives the probability that the true redshift lies within the primary peak of the PDF, and so is necessary in normalizing the Gaussian height.

• Sum the weighted probabilities for all galaxies within the aperture and divide by the area of the aperture which falls outside any masked regions ($A_{\rm ap}$). This gives us the density within the aperture, as in Equation (1).



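$$\rho_{\rm ap} = \frac{\sum_i w_i p_i}{A_{\rm ap}} \qquad (1)$$

where $w_i$ is the ODDS weight and $p_i$ the slice probability of galaxy $i$, and the sum runs over the galaxies within the aperture.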
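The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the P3 implementation: each photo-z PDF is taken to be a Gaussian with a known width, the galaxies passed in are assumed to have already been selected as lying inside the aperture, and the function names and tuple layout are hypothetical. The cut at a slice probability of 0.1% follows the threshold quoted in the noise estimate of this section.

```python
import math


def slice_probability(z_phot, sigma_z, z_centre, z_s=0.02):
    """Probability that a Gaussian photo-z PDF (mean z_phot, width sigma_z)
    places the galaxy inside the slice [z_centre - z_s/2, z_centre + z_s/2]."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf((z - z_phot) / (sigma_z * math.sqrt(2.0))))

    return cdf(z_centre + 0.5 * z_s) - cdf(z_centre - 0.5 * z_s)


def aperture_density(galaxies, z_centre, unmasked_area, z_s=0.02, p_min=1e-3):
    """Pseudo-3D density: sum of ODDS-weighted slice probabilities divided by
    the unmasked aperture area.

    galaxies: iterable of (z_phot, sigma_z, odds) for galaxies in the aperture.
    Returns (rho_ap, n), where n counts the galaxies that contribute.
    """
    total, n = 0.0, 0
    for z_phot, sigma_z, odds in galaxies:
        p = slice_probability(z_phot, sigma_z, z_centre, z_s)
        if p < p_min:  # probabilities below 0.1% are rounded down to zero
            continue
        total += odds * p
        n += 1
    return total / unmasked_area, n


if __name__ == "__main__":
    sample = [(0.50, 0.05, 1.0), (0.52, 0.05, 0.9), (0.90, 0.05, 1.0)]
    print(aperture_density(sample, z_centre=0.50, unmasked_area=0.2))
```

The same function can be applied to the galaxies in the background annulus by passing the unmasked annulus area instead.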



This procedure can also be used with only minor modifications to determine the density within the annulus surrounding the test point, which will give us the local background galaxy density, and thus the overdensity $\delta$. For our purposes, we use an annulus with inner radius of 1 $h^{-1}$ Mpc and outer radius 3 $h^{-1}$ Mpc. This large size minimizes the effect of large-scale structure such as superclusters, filaments, and walls on the calculated $\delta$, while it remains small enough to be representative of density variation caused by observational effects.

In order to obtain a pure sample of groups, we will select only groups with a sufficiently high signal-to-noise ratio (S/N) in $\delta$. In order to calculate the noise in our measurement of $\delta$, we model the number of galaxies which make a significant contribution to the density as a Poisson distribution. In the procedure above, we include only galaxies that have a probability of being within this redshift slice of at least 0.1%, rounding other probabilities down to zero. Using Poisson statistics, a sample which finds $n$ contributing galaxies would give us a standard error of $\sqrt{n}$. We can then estimate the Poisson error of our density as:



$$\sigma_{\rho_{\rm ap},{\rm Poisson}} = \frac{\langle w_i p_i \rangle \sqrt{n}}{A_{\rm ap}} \qquad (2)$$



We can then repeat these calculations for the annulus, giving the error in its density. In the end, we combine these errors in quadrature to give the final noise in $\delta$. This allows us to calculate the S/N for each test point.
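A minimal Python sketch of this signal-to-noise estimate is given below. It rests on two assumptions that are not spelled out in the text: the overdensity is taken to be $\delta = \rho_{\rm ap}/\rho_{\rm bg} - 1$, with $\rho_{\rm bg}$ the annulus density, and "combining in quadrature" is interpreted as standard first-order error propagation of the two relative Poisson errors. The function names are hypothetical.

```python
import math


def relative_poisson_error(n):
    """Relative Poisson error of a density built from n contributing
    galaxies: (<w p> sqrt(n) / A) / rho = 1 / sqrt(n)."""
    if n <= 0:
        raise ValueError("need at least one contributing galaxy")
    return 1.0 / math.sqrt(n)


def overdensity_snr(rho_ap, n_ap, rho_bg, n_bg):
    """Return (delta, S/N), ASSUMING delta = rho_ap / rho_bg - 1 and
    first-order propagation of the aperture and annulus Poisson errors."""
    delta = rho_ap / rho_bg - 1.0
    rel = math.hypot(relative_poisson_error(n_ap), relative_poisson_error(n_bg))
    sigma_delta = (rho_ap / rho_bg) * rel
    return delta, delta / sigma_delta


if __name__ == "__main__":
    print(overdensity_snr(rho_ap=4.0, n_ap=16, rho_bg=1.0, n_bg=400))
```

Because the annulus covers a much larger area than the aperture, its relative error is small and the noise in $\delta$ is typically dominated by the aperture term.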

With our 3D grid of S/N, we then proceed to detect the peaks, as these are most likely to correspond to the centres of galaxy groups. In order to not identify multiple peaks with the same group, we apply a threshold distance of $R_t = 0.5\,h^{-1}$ Mpc, the size of a large group, and a threshold of $z_t = 0.02$ in redshift in which a peak must be the highest point, rather than simply requiring that the peak must be higher than the points immediately surrounding it in the grid. This procedure minimizes the chance of groups being detected at multiple points in the sky, as any substructure that lies within $R_t$ of the group centre will be ignored. The S/N of a rich group also tends to steadily increase toward the centre, so even substructure more distant than $R_t$ from the centre is likely to lie within $R_t$ of at least one point with greater S/N, and thus it will also be ignored. Multiple detections along the line of sight are more difficult to handle, as the scatter of photometric redshifts within a rich cluster can approach 0.2 (?). However, this is primarily a problem with richer groups, as a large number of galaxies is necessary for multiple peaks to be observed at redshifts separated by more than $z_t = 0.02$. Future refinements to the algorithm may work to address this issue by merging multiple peaks that lie along the line of sight.
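The peak criterion described above can be sketched as follows. This is an illustrative pure-Python version under simplifying assumptions: the S/N values sit on a regular grid stored as nested lists, the transverse spacing is treated as constant and Euclidean, and a point that merely ties with a neighbour is not counted as a peak. The function name and argument layout are hypothetical.

```python
def find_peaks(snr, r_g=0.2, r_t=0.5, z_s=0.02, z_t=0.02):
    """Find grid points that are strictly the highest within r_t (transverse)
    and z_t (redshift).

    snr[k][j][i]: S/N on a regular grid (k: redshift slice; j, i: transverse).
    Returns a list of (k, j, i) index tuples.
    """
    nz, ny, nx = len(snr), len(snr[0]), len(snr[0][0])
    reach = int(r_t / r_g + 1e-9)    # transverse reach in grid cells
    reach_z = int(z_t / z_s + 1e-9)  # redshift reach in slices
    # Offsets of all neighbours inside the exclusion cylinder.
    offsets = [
        (dk, dj, di)
        for dk in range(-reach_z, reach_z + 1)
        for dj in range(-reach, reach + 1)
        for di in range(-reach, reach + 1)
        if (dk, dj, di) != (0, 0, 0)
        and (dj * dj + di * di) * r_g * r_g <= r_t * r_t + 1e-12
    ]
    peaks = []
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                value = snr[k][j][i]
                is_peak = True
                for dk, dj, di in offsets:
                    kk, jj, ii = k + dk, j + dj, i + di
                    inside = 0 <= kk < nz and 0 <= jj < ny and 0 <= ii < nx
                    if inside and snr[kk][jj][ii] >= value:
                        is_peak = False
                        break
                if is_peak:
                    peaks.append((k, j, i))
    return peaks
```

A subsequent S/N cut then amounts to keeping only those returned points whose grid value exceeds the chosen threshold.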

Once the peak catalogue is complete, we then extract only those peaks above some signal-to-noise threshold. This leaves us with our ultimate group catalogue. The details of how the signal-to-noise thresholds and the aperture sizes are chosen are described in the following section.



3 TESTS OF THE GROUP FINDING ALGORITHM

In order to test the P3 algorithm, we compared its results to a catalogue of dark matter halos containing at least two galaxies in six lightcones extracted from the Millennium simulation by Kitzbichler & White (2007), and to a friends-of-friends spectroscopic group catalogue generated by Knobel et al. (2009) using the zCOSMOS 10k sample covering the COSMOS field, which overlaps with the CFHTLS D2 field.



3.1 Comparison with Simulations 

To assess the accuracy of the P3 algorithm against an ideal catalogue, we used six simulated 2 deg$^2$ lightcones extracted from the Millennium simulation (Springel et al. 2005; De Lucia & Blaizot 2007) by Kitzbichler & White (2007). Given the resolution limits of the Millennium simulation, the catalogue is complete for Johnson $I < 24$ in the AB system. We also used a magnitude limit of $I < 22.5$ in the I-band for most of the testing, as this matches the spectroscopic catalogue of Knobel et al. (see Section 3.4 below). We also tested including galaxies with I-band magnitudes between 22.5 and 24 to assess how this affected our accuracy (see Section 3.2.2 below). For our testing, we used only galaxies between $z_{\rm lo} = 0.2$ and $z_{\rm hi} = 0.8$, as this is where we expect to attain the best lensing signal.

To simulate photometric redshifts for this galaxy catalogue, for simplicity we applied a Gaussian deviate to the redshifts of the galaxies. We generated two mock photo-z catalogues, each using different simulated photo-z errors. The first mock catalogue, hereafter CFHTLSpz, simulated the accuracy of the photometric redshifts in the CFHTLS Deep fields of Ilbert et al. (2006), with a redshift error of 0.05 for $I < 22.5$, and 0.10 for $22.5 < I < 24$. The second catalogue, hereafter COSMOS30pz, mimicked the accuracy of the COSMOS-30 (Ilbert et al. 2009) photometric redshifts: 0.02 for $I < 22.5$, and 0.04 for $22.5 < I < 24$. We note that, after these tests were done, a recent analysis of photo-z's in the CFHTLS-Wide survey (Hildebrandt et al. 2009) suggests that the errors in this survey will be approximately 0.03 for $I < 22.5$ and $0.2 < z < 1.1$, which lies between the two error ranges tested in this paper.
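The construction of the two mock catalogues can be illustrated with a short Python sketch. It assumes that the quoted errors are Gaussian standard deviations applied directly in $z$, and that a galaxy exactly at $I = 22.5$ is assigned to the faint bin; both choices, and the function names, are assumptions made for illustration.

```python
import random


def photoz_sigma(i_mag, bright_sigma=0.05):
    """Assumed photo-z scatter: bright_sigma for I < 22.5 and twice that for
    22.5 <= I < 24 (0.05/0.10 for CFHTLSpz, 0.02/0.04 for COSMOS30pz)."""
    if i_mag < 22.5:
        return bright_sigma
    if i_mag < 24.0:
        return 2.0 * bright_sigma
    raise ValueError("galaxy is fainter than the I < 24 catalogue limit")


def mock_photoz(z_true, i_mag, bright_sigma=0.05, rng=random):
    """Return (z_phot, sigma_z): the true redshift perturbed by a Gaussian
    deviate. Catastrophic errors are not simulated."""
    sigma = photoz_sigma(i_mag, bright_sigma)
    return rng.gauss(z_true, sigma), sigma


if __name__ == "__main__":
    rng = random.Random(42)
    print(mock_photoz(0.5, 21.8, bright_sigma=0.05, rng=rng))  # CFHTLSpz-like
    print(mock_photoz(0.5, 23.1, bright_sigma=0.02, rng=rng))  # COSMOS30pz-like
```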

Catastrophic errors, where the actual redshift of the 
galaxy differs from its photometric redshift by many stan- 
dard deviations, were not simulated in our tests. With real 
data, we will be able to select only galaxies that have a min- 
imal chance of being catastrophic errors. This selection will 
likely result in a slightly less complete catalogue, but should 
guarantee that the purity is not decreased due to catas- 
trophic errors. Within the redshift range of 0.2 < z < 0.8, 
we expect the fraction of catastrophic errors to be less than 
5% (?, Hildebrandt et al. in preparation). 

The Millennium simulation also contains halo information for galaxies. We used halos identified by De Lucia et al. (2006) through a friends-of-friends method applied to the dark matter particles to identify the real groups of galaxies. The halo centres were determined to be the average positions of the galaxies contained in the halo. Only halos containing at least two galaxies were used in the group comparisons. In total, we obtained 31668 groups across the six fields, for approximately 2600 groups per deg$^2$.

Fig. 1 and Fig. 2 show a graphical representation of the S/N calculated by the P3 algorithm for a selection of redshift slices, to illustrate how the detected peaks correspond to actual groups. Recall that peaks are detected in three dimensions, so what appear to be peaks in the individual 2D plots may actually be detected on another slice. Additionally, peaks have a threshold radius $R_t = 0.5\,h^{-1}$ Mpc within which they must be the highest point to count as a peak, so some peaks may not be detected if they are sufficiently close to another peak.

Figure 1. The calculated S/N for the $\delta$ of galaxies on a grid of points in R.A. and Dec, sliced at different values of the redshift, for a field drawn from the Millennium simulation by Kitzbichler & White (2007). S/N is indicated by the colour. Locations of halos with at least 3 galaxies are indicated by white circles, with their radii indicating the richnesses of the groups. White crosses indicate the location of a circle in a nearby layer, within $\Delta z = 0.04$. Detected peaks with a S/N > 3 are indicated by the black diamonds. Peaks are detected in three dimensions, so what appear to be peaks in the individual 2D plots may actually be detected on another slice. Additionally, peaks have a threshold radius, $R_t = 0.5\,h^{-1}$ Mpc, within which they must be the highest point to count as a peak, so some peaks may not be detected if they are sufficiently close to another peak. Left column contains plots using CFHTLSpz errors with $R_{\rm ap} = 0.5\,h^{-1}$ Mpc, right column contains plots using COSMOS30pz errors with $R_{\rm ap} = 0.5\,h^{-1}$ Mpc. Redshift slices, from top to bottom: 0.58, 0.60, 0.62.

Figure 2. An alternate view of the plots from Fig. 1, sliced at constant Dec, showing how far groups extend in the redshift dimension. The left column shows simulations with CFHTLSpz errors, and the right column shows simulations with COSMOS30pz errors. The top row shows simulations with $R_{\rm ap} = 0.5\,h^{-1}$ Mpc, and the bottom row shows simulations with $R_{\rm ap} = 0.25\,h^{-1}$ Mpc.

3.1.1 Purity and Completeness 

In order to assess purity and completeness, it is first necessary to define what constitutes a match. Our comparison method aimed primarily to assess the purity of our samples, so a peak is defined as a match if it lies within a redshift difference of $z_{\rm mat} = 0.04$ and a projected radius $R_{\rm mat} = 0.5\,h^{-1}$ Mpc of at least one group-containing halo. These parameters were adopted because $z_{\rm mat}$ is approximately twice the uncertainty in the mean photometric redshift for a group of 5 members, and $R_{\rm mat}$ is approximately the upper size limit for a group. The comparisons were made for peaks selected above various S/N, as well as a "Control" sample which consisted of positions generated from a uniform random grid in R.A., Dec, and z. The purity, $P$, is then defined as the fraction of P3-peaks that match to a Millennium group-halo.

Although completeness is not the primary goal of P3, we nevertheless measured completeness, $C$, by calculating the number of spectroscopically-identified groups in the field which had at least one P3 group matched to it. Table 1 shows a summary of the accuracy of the P3 algorithm when run on a mock galaxy catalogue from the Millennium simulation, using the catalogue of groups with at least 2 members as a comparison. For a fiducial minimum S/N limit of 3 and $R_{\rm ap} = 0.25\,h^{-1}$ Mpc, the P3 algorithm typically detects around 295 groups/deg$^2$ in the redshift range $0.2 < z < 0.8$ for the simulated galaxy catalogue. Of these detected groups, approximately 84% match to at least one real group with at least two bright members when we simulate CFHTLSpz errors. This will give us approximately 248 correct group detections per deg$^2$. Our completeness is very low, however, picking up at best 44% of groups when we use a S/N cut of 2 and $R_{\rm ap} = 0.25\,h^{-1}$ Mpc. For our purposes, this is not a concern, as the total number of groups we detect in the CFHTLS-Wide will be enough for our goals.



When we simulate COSMOS30pz errors, also using a minimum S/N limit of 3 and $R_{\rm ap} = 0.25\,h^{-1}$ Mpc, the purity increases to 97%, with 143 groups/deg$^2$ detected and 139 of these being real.
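The matching criterion and the resulting purity and completeness can be sketched in Python as below. For simplicity this sketch assumes that peaks and halos are already expressed in flat transverse comoving coordinates (in $h^{-1}$ Mpc) plus a redshift; a real catalogue would need angular separations converted at the appropriate redshift. The function names and the tuple layout are hypothetical.

```python
import math


def is_match(peak, halo, r_mat=0.5, z_mat=0.04):
    """True if the peak lies within r_mat (projected) and z_mat (redshift)
    of the halo. Both are (x, y, z) tuples, x and y in h^-1 Mpc."""
    separation = math.hypot(peak[0] - halo[0], peak[1] - halo[1])
    return separation <= r_mat and abs(peak[2] - halo[2]) <= z_mat


def purity_completeness(peaks, halos, r_mat=0.5, z_mat=0.04):
    """Purity: fraction of peaks matching at least one halo.
    Completeness: fraction of halos matched by at least one peak."""
    hit_peaks = sum(
        any(is_match(p, h, r_mat, z_mat) for h in halos) for p in peaks
    )
    hit_halos = sum(
        any(is_match(p, h, r_mat, z_mat) for p in peaks) for h in halos
    )
    purity = hit_peaks / len(peaks) if peaks else float("nan")
    completeness = hit_halos / len(halos) if halos else float("nan")
    return purity, completeness


if __name__ == "__main__":
    peaks = [(0.0, 0.0, 0.50), (10.0, 10.0, 0.30)]
    halos = [(0.3, 0.0, 0.52), (0.2, 0.1, 0.49), (5.0, 5.0, 0.60)]
    print(purity_completeness(peaks, halos))
```

Note that a single peak can match several halos, so the number of matched peaks and the number of matched halos generally differ.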
Table 1. Summary of the purity and completeness of the P3 algorithm when its results are matched to the Millennium halo catalogue, containing 31668 total groups, for various minimum S/N limits and a Control. The combined galaxy catalogues cover a total of 12 deg$^2$ of simulated sky. Cut is what, if any, S/N limit is applied to the peak catalogue; $R_{\rm ap}$ is the aperture radius in $h^{-1}$ Mpc; $N_{\rm hit}$ is the number of detected peaks which match to a halo; $N_{\rm peak}$ is the total number of detected peaks; $P$ is the fraction of peaks which match to a halo; $C$ is the fraction of halos which match to a detected peak; $\langle N_m \rangle$ is the mean number of members in the detected peaks which match to a halo; $\langle m \rangle$ is the geometric mean mass of the detected peaks which match to a halo, in units of $10^{12}\,h^{-1}\,M_\odot$; $\langle N_{\rm mat} \rangle$ is the mean number of halos within the matching distance of a peak (not including peaks which match to zero halos); and $\langle N_{\rm field} \rangle$ is the mean number of field galaxies within the matching distance of a peak (also not including peaks which match to zero halos).

Table 2. Summary of the purity and completeness of the P3 algorithm when its results are matched to the Millennium halo catalogue, containing 31668 total groups, for various minimum S/N limits and a Control. Here the P3 algorithm used a smaller grid spacing, $R_g \approx 0.1\,h^{-1}$ Mpc and $z_g = 0.01$, and $R_{\rm mat}$ was reduced to $0.25\,h^{-1}$ Mpc. The combined galaxy catalogues cover a total of 12 deg$^2$ of simulated sky. Columns are as Table 1.
2.93   620   746   0.83   0.02   22.31   6.17   1.77   3.47


3.1.2 Influence of resolution and matching radius 

Despite the apparently high purity of our results above, there may be some issues with overmatching. The typical size for a poor galaxy group is ~0.25 $h^{-1}$ Mpc, so we would not expect a S/N peak that corresponds to such a group to be separated from the group's center by more than this amount. A smaller matching radius would decrease the number of spurious matches, but it may also cause us to lose some real matches. To test the impact of this, we ran P3 with a higher resolution than normal: $R_g \approx 0.1$ comoving $h^{-1}$ Mpc and redshift slices having thickness of $z_g = 0.01$. This resolution allows peaks to be potentially more than a single grid spacing away from a group center and still resolve as a match.

Table 2 shows the results of this high-resolution run of the P3 algorithm, when its results are matched to the halo catalogue using a transverse matching length of $R_{\rm mat} = 0.25\,h^{-1}$ Mpc instead of the usual $0.5\,h^{-1}$ Mpc. The benefit of the smaller aperture size is much larger with this lower matching length, implying that the lower aperture size allows more precise determination of the positions of groups. Although the purity shown here is overall lower than with the larger matching length, the difference in purity relative to the Control is now larger.



3.1.3 Richness of matched groups 

The typical group matched by the P3 algorithm has 5-15 members, though this number depends on what signal-to-noise cut and which level of photo-z errors we used. Even though it might seem that a high signal-to-noise cut would significantly bias us toward highly populated groups, the fact that there are many more poor groups than rich groups means that many of these groups will, by chance, have a large signal-to-noise and be detected by our algorithm. This effect can easily be seen in Fig. 3, Fig. 4 and Fig. 5.

There are some additional potential issues with the method we have used to assess P3's accuracy. The use of a strict matching radius between a peak location and a group centre means that we may miss some larger systems if the detected peak lies far enough from the calculated centre of the halo, even if the peak does lie within the group. This effect is lessened by using a larger matching length, but this also makes detections more prone to background contamination.
Figure 3. The $\delta$ calculated by our algorithm at the location of every group-containing halo (small blue dots) versus the number of members in those groups for the Millennium fields. Also shown are the $\delta$ of all estimated groups and the number of bright ($I < 22.5$) members of the halo they match to (large red dots with error bars). The numbers of members for all data points have a random component of less than 1 included in order to aid viewing. The left column shows simulations with CFHTLSpz errors, and the right column shows simulations with COSMOS30pz errors. The top row shows simulations with $R_{\rm ap} = 0.5\,h^{-1}$ Mpc, and the bottom row shows simulations with $R_{\rm ap} = 0.25\,h^{-1}$ Mpc.



In order to estimate the richness of our detected groups, we investigated whether the local $\delta$ could be used to estimate the number of members contained in the group. Fig. 3 shows the $\delta$ versus the number of bright ($I < 22.5$) members in the matched halo for each photometrically-detected group that matched to at least one halo. We also calculated the $\delta$ at the location of every halo for further data.

Although there is a weak correlation between the number of members in a group and its $\delta$, using this to estimate the number of members is problematic. This is primarily because there are many more poor groups than rich groups and, owing to the large photometric uncertainties, many of these poor groups will have their $\delta$ scattered to high values. For instance, a group with $\delta = 4$ (measured with $R_{\rm ap} = 0.25\,h^{-1}$ Mpc) has a roughly equal chance to have 2 members as it does to have 10. Given this, a simple mapping of $\delta$ to number of members would not likely be useful.

One potential concern was that the above-random match rate of our photometric galaxy catalogue to the halo catalogue might have been due primarily to a very high match rate among richer groups averaged with a lower match rate to poorer groups. If this were the case, then our method could in actuality be little better than random for identifying poor groups. To test this, we took our simulated galaxy catalogues and removed all galaxies within them that were found to be members of a rich group (which we defined as any group having more than 20 members). We then ran our algorithm on this pruned catalogue and assessed its accuracy through the same method as before. The results of this test are summarized in Table 3, which shows that there is in fact very little effect on the accuracy of our algorithm when the richer groups are removed from consideration. The purity in fact rises slightly with this test, which may be due to P3 detecting the rich groups at a point removed from their centres, as these groups are possibly large enough that the distance between the peak S/N and the group center may exceed the matching length.

Figure 4. The number of groups detected with a given number of members for the Millennium fields, using a minimum S/N cut of 3. Solid black: CFHTLSpz errors, $R_{\rm ap} = 0.5\,h^{-1}$ Mpc. Dotted blue: COSMOS30pz errors, $R_{\rm ap} = 0.5\,h^{-1}$ Mpc. Dashed red: CFHTLSpz errors, $R_{\rm ap} = 0.25\,h^{-1}$ Mpc. Dash-dotted magenta: COSMOS30pz errors, $R_{\rm ap} = 0.25\,h^{-1}$ Mpc.

Figure 5. The number of groups detected with a given number of members for the Millennium fields, using all peaks detected, CFHTLSpz errors, and $R_{\rm ap} = 0.25\,h^{-1}$ Mpc, for various cuts on $\delta$. Dotted magenta: $\delta > 2$. Short dashed red: $\delta > 3$. Long dashed blue: $\delta > 4$. Solid black: $\delta > 5$.

Table 3. Summary of the purity and completeness of the P3 algorithm using $R_{\rm ap} = 0.5\,h^{-1}$ Mpc, CFHTLSpz errors (comparable to Table 1, top row, left column), and including all galaxies with $I < 22.5$ that do not lie within a rich ($N_m > 20$) group, matched to the halo catalogue of poor ($N_m \le 20$) groups, for various minimum S/N limits and a Control. The purity shows no statistically significant decrease relative to the catalogue used for Table 1, showing that rich groups are not significantly biasing our purity and completeness upwards. The combined galaxy catalogues cover a total of 12 deg$^2$ of simulated sky. Columns are as Table 1.

Cut         R_ap   N_hit   N_peak   P      C      <N_m>   <m>    <N_mat>   <N_field>
Control     0.5    1705    2988     0.57   N/A    3.87    1.37   1.82      6.64
All Peaks   0.5    2767    3756     0.74   0.19   5.38    2.28   2.08      6.87
S/N > 2     0.5    2062    2669     0.77   0.14   6.06    2.37   2.18      7.24
S/N > 3     0.5    1299    1591     0.82   0.09   6.87    2.38   2.35      7.63
S/N > 4     0.5    585     690      0.85   0.04   8.31    2.37   2.62      8.21

3.1.4 Background contamination

The best-match groups we compared our peaks to in Table 1 and Table 2 may not be the only causes of the high detected S/N. It is possible that other nearby groups and field galaxies are also contributing to the S/N at these peaks. The $\langle N_{\rm mat} \rangle$ and $\langle N_{\rm field} \rangle$ columns in these tables give the average total number of groups within the matching distance and the average number of field galaxies within the matching distance, respectively. It can be seen from this that with the large aperture size and matching length, most of our peaks actually match to 2 or more groups, with $\langle N_{\rm mem} \rangle = 7.2$ and approximately 8 field galaxies also within the matching distance. Thus typically, the interloper fraction is approximately 69%. Both the number of matched groups and the field contamination decrease when lowering the matching length, but so does the measured purity. For the smaller aperture size, the lower matching length is likely a more reasonable measurement, as galaxies outside this smaller length will not be contributing to the measured S/N.




3.2 Optimization of the algorithm 

3.2.1 Aperture size (R ap ) 

Although decreasing the aperture size typically decreased the
purity of our catalogue at a given S/N cut, it also greatly
increased the number of peaks detected. The important question
is whether the decreased aperture size results in increased
purity when the same number of groups are detected, or,
equivalently, whether it results in more groups detected at the
same purity level. Fig. 6 shows how the purity relates to the
number of groups detected for both aperture sizes, along with
the results of changing to a fainter magnitude limit (discussed
below in Section 3.2.2). From this plot, it is clear that the
smaller aperture size is beneficial, though it may require a
larger S/N cut to attain sufficient purity.

Figure 6. Purity as a function of the total number of peaks detected. Left plot shows simulations with CFHTLSpz errors, right plot
shows simulations with COSMOS30pz errors. Blue squares: $R_{\rm ap} = 0.5$ and $i' < 22.5$. Magenta triangles: $R_{\rm ap} = 0.25$ and $i' < 22.5$. Red
stars: $R_{\rm ap} = 0.5$ and $i' < 24$ (discussed in Section 3.2.2). Each field from the Millennium simulation is represented by three points, for
group catalogues limited by S/N > 2, S/N > 3, and S/N > 4.
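The trade-off between purity and the number of detected peaks can be checked directly from tabulated counts. The following is a minimal sketch, assuming that the purity column P is the ratio N_hit / N_peak (the ratio does reproduce the tabulated P values). The counts are copied from Table 5 (CFHTLSpz errors, FoF comparison), not from the halo comparison plotted in Fig. 6, and the function and variable names are illustrative only.

```python
# (N_hit, N_peak) pairs copied from Table 5, CFHTLSpz errors, keyed by R_ap.
TABLE5_CFHTLS = {
    0.5: {"S/N > 2": (1920, 2406), "S/N > 3": (1238, 1496), "S/N > 4": (632, 725)},
    0.25: {"S/N > 2": (8637, 10904), "S/N > 3": (3152, 3537), "S/N > 4": (906, 986)},
}


def purity(n_hit, n_peak):
    """Fraction of detected peaks that match at least one catalogued group."""
    return n_hit / n_peak


def purity_curve(rows):
    """Return (N_peak, purity) points sorted by increasing N_peak."""
    return sorted((n_peak, purity(n_hit, n_peak)) for n_hit, n_peak in rows.values())


if __name__ == "__main__":
    for r_ap, rows in TABLE5_CFHTLS.items():
        for n_peak, p in purity_curve(rows):
            print(f"R_ap = {r_ap}: N_peak = {n_peak:5d}, P = {p:.2f}")
```

In these counts, the S/N > 3 cut with the 0.25 aperture yields more than twice as many peaks as the same cut with the 0.5 aperture, at a higher purity.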



3.2.2 Magnitude limit 

Although galaxies with $i' > 22.5$ have less accurate redshifts,
they should still provide some redshift information
that could be useful for identifying groups. In theory,
poor-precision redshift data can be used to increase the accuracy
of group identifications.

We ran the P3 algorithm on our simulated galaxy catalogue,
using all galaxies with $i' < 24$, assigning errors to
galaxies with $22.5 < i' < 24$ of twice as much as for galaxies
with $i' < 22.5$, resulting in 0.10 for the CFHTLSpz errors
and 0.04 for the COSMOS30pz errors. In the end, using faint
galaxy photometric redshifts showed a small decrease in the
purity of group-finding for CFHTLSpz errors, but a small
increase for COSMOS30pz errors, as can be seen in a comparison
of Table 4 and Table 1. As a result, it will not be
worth the extra computation time to include galaxies with
photo-z errors > 0.05 in runs of P3 on real data.
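The error assignment used in this test is simple enough to write down explicitly. The sketch below is only an illustration of the rule stated in the text (faint galaxies get twice the bright-galaxy error); the function name, its arguments and the choice to return `None` for galaxies fainter than the limit are assumptions made for this example.

```python
def assumed_photoz_error(i_mag, bright_sigma):
    """Simulated photo-z error for a galaxy of magnitude i_mag.

    bright_sigma is the error assumed for galaxies brighter than 22.5
    (0.05 for CFHTLSpz, 0.02 for COSMOS30pz in the text).
    """
    if i_mag < 22.5:
        return bright_sigma
    if i_mag < 24.0:
        # Faint galaxies were assigned twice the bright-galaxy error.
        return 2.0 * bright_sigma
    # Galaxies fainter than the limit are not used at all.
    return None


if __name__ == "__main__":
    for name, sigma in (("CFHTLSpz", 0.05), ("COSMOS30pz", 0.02)):
        print(name, assumed_photoz_error(21.0, sigma), assumed_photoz_error(23.0, sigma))
```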

As the locations of fainter galaxies are highly correlated
with the locations of bright galaxies, the effect of using them
in the P3 algorithm is to increase the contrast of the existing
S/N distribution, albeit at a lower resolution in the redshift
dimension, as can be seen in Fig. 7. The lower redshift
resolution also manifests in a greater error in determination
of the peak redshift, which may result in more peaks being
scattered away from the group they represent by more than
$z_{\rm mat}$. It is also possible that multiple groups along the line of
sight may become blended into a single structure, with the
peak of this structure lying between the two groups. When
the peak catalogue is compared to the actual locations of the
groups, it is possible that the peak is within our threshold
matching distance of neither.
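The blending effect can be illustrated with a toy matching test. The sketch below assumes that a peak counts as a hit when at least one group lies within a transverse matching distance and within a redshift matching distance of it; the names `r_mat` and `z_mat`, the coordinate convention and the numerical values are placeholders chosen for this example, not the values used by P3.

```python
import math


def is_hit(peak, groups, r_mat, z_mat):
    """True if any group lies within r_mat (transverse) and z_mat (redshift) of the peak.

    peak and groups are (x, y, z) tuples: x, y are transverse positions
    in the same length unit as r_mat, and z is the redshift.
    """
    px, py, pz = peak
    for gx, gy, gz in groups:
        if math.hypot(gx - px, gy - py) <= r_mat and abs(gz - pz) <= z_mat:
            return True
    return False


def purity(peaks, groups, r_mat, z_mat):
    """Fraction of peaks that match at least one group."""
    if not peaks:
        return 0.0
    hits = sum(is_hit(p, groups, r_mat, z_mat) for p in peaks)
    return hits / len(peaks)


if __name__ == "__main__":
    # Two groups along one line of sight, blended into a single peak between them.
    groups = [(0.0, 0.0, 0.50), (0.0, 0.0, 0.60)]
    blended_peak = (0.0, 0.0, 0.55)
    print(is_hit(blended_peak, groups, r_mat=0.5, z_mat=0.04))  # matches neither group
```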

The result of our optimization analysis is that the
0.25 $h^{-1}$ Mpc aperture size performs better, and a S/N cut of
3 provides a good balance of purity and completeness without
overly biasing us toward rich groups. A resolution of
$\sim 0.1\,h^{-1}$ Mpc is preferred with this aperture size, although
it is more computationally expensive. The magnitude limit
for a real galaxy catalogue will depend on the distribution
of $\sigma_z$ vs. $z$. A magnitude which limits $\sigma_z$ to a maximum of
0.05 will likely provide the best results, as it was when the
simulated errors for fainter galaxies went above this level
that the purity began to decrease.



3.3 Comparison to mock spectroscopic "Friends-of-friends" groups

In the following section we will compare the P3 group cata- 
logue to spectroscopically-identified groups from zCOSMOS. 
When working with real-world data, peculiar velocities of 
galaxies in groups cause small redshift errors, which any 
group-finding method must account for. The result of this is
that spectroscopic identification of groups suffers from its 
own imperfections in purity and completeness. A compari- 
son of the P3 group catalogue to a spectroscopic catalogue 
will let us assess how much imperfections in real galaxy cat- 
alogues will affect group-finding. 

To make this assessment, we first need to determine how

Table 4. Summary of the purity and completeness of the P3 algorithm applied to the Millennium fields with $R_{\rm ap} = 0.5$ (comparable to
Table 1, top row), and including all galaxies with $i' < 24$, when matched to the halo catalogue of groups, containing 31668 total groups,
for various minimum S/N limits and a Control. The combined galaxy catalogues cover a total of 12 deg$^2$ of simulated sky. Columns are
as Table 1.

CFHTLSpz errors
Cut        R_ap  N_hit  N_peak  P     C     <N_m>  <m>   <N_mat>  <N_field>
Control    0.5   1705   2988    0.57  N/A   3.87   1.37  1.82     6.64
All Peaks  0.5   2006   2592    0.77  0.15  7.16   3.28  2.16     6.32
S/N > 2    0.5   1655   2055    0.81  0.13  7.86   3.52  2.24     6.54
S/N > 3    0.5   1333   1607    0.83  0.10  8.80   3.74  2.30     6.72
S/N > 4    0.5   964    1124    0.86  0.08  10.52  4.07  2.44     7.19

COSMOS30pz errors
Cut        R_ap  N_hit  N_peak  P     C     <N_m>  <m>   <N_mat>  <N_field>
Control    0.5   1705   2988    0.57  N/A   3.87   1.37  1.82     6.64
All Peaks  0.5   2125   2470    0.86  0.19  7.43   3.46  2.36     6.61
S/N > 2    0.5   1878   2088    0.90  0.17  8.00   3.74  2.44     6.85
S/N > 3    0.5   1506   1619    0.93  0.14  9.07   4.19  2.58     7.20
S/N > 4    0.5   979    1013    0.97  0.09  11.78  5.10  2.81     7.75



Figure 7. S/N maps generated by running P3 on the simulated catalogue with CFHTLSpz errors and $R_{\rm ap} = 0.5\,h^{-1}$ Mpc. Left plot was
generated from a run which included all galaxies with $i' < 22.5$, and the right plot was generated from a run which included all galaxies
with $i' < 24$. Galaxies with $22.5 < i' < 24$ were assigned errors of twice as much as the brighter galaxies. It can be seen here that including
fainter galaxies tends to increase the contrast of the distribution by increasing the S/N in apparent structures, but there is no significant
change to its shape.



much the purity and completeness drop when we switch
from a comparison with the halo positions to a comparison
with a catalogue that better simulates what we might obtain
with real galaxy catalogues. For this purpose, we ran
a friends-of-friends (FoF) algorithm using the redshifts of
galaxies in the Millennium simulation. The generated catalogue
contained a total of 39101 groups with at least two bright
members, or approximately 3300 groups per deg$^2$. This
is approximately 25% more groups than the halo catalogue,
and the effect of this can be seen in the increased 'purity'
of the Control catalogue. The increase in the Control purity
is less than 25%, which is likely due to many of the new
FoF groups lying close together, as would happen if some
real groups were detected as multiple FoF groups. A Control
peak which lies within the matching threshold of two such
groups is only counted as a single match, so 25% more
groups, some of which lie near other groups, will result in a
purity increase of less than 25%. Interestingly, the average
number of field galaxies detected near each peak dropped,
which may be due to the FoF algorithm matching some field
galaxies into spurious pairs.
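For readers unfamiliar with friends-of-friends grouping, the following is a minimal sketch of the general technique, not the implementation used for this work. It assumes separate transverse and redshift linking lengths (`link_r`, `link_z`), which are hypothetical parameters; the text does not state the linking criteria that were used. Galaxies are linked pairwise and groups are the connected components, found here with a simple union-find structure.

```python
import math


def friends_of_friends(galaxies, link_r, link_z, min_members=2):
    """Group galaxies whose pairwise separations fall within the linking lengths.

    galaxies is a list of (x, y, z) tuples: x, y are transverse positions
    in the units of link_r, and z is the redshift. Returns a sorted list of
    groups, each a list of galaxy indices.
    """
    parent = list(range(len(galaxies)))

    def find(i):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # O(N^2) pair search; adequate for a sketch, too slow for a real catalogue.
    for i in range(len(galaxies)):
        xi, yi, zi = galaxies[i]
        for j in range(i + 1, len(galaxies)):
            xj, yj, zj = galaxies[j]
            if math.hypot(xi - xj, yi - yj) <= link_r and abs(zi - zj) <= link_z:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups = {}
    for i in range(len(galaxies)):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) >= min_members)


if __name__ == "__main__":
    chain = [(0.0, 0.0, 0.5), (0.2, 0.0, 0.5), (0.5, 0.0, 0.5), (0.7, 0.0, 0.5)]
    print(friends_of_friends(chain, link_r=0.35, link_z=0.01))  # one group of four
    print(friends_of_friends(chain, link_r=0.25, link_z=0.01))  # split into two pairs
```

The second call shows how a shorter linking length fragments one structure into two FoF groups, the kind of fragmentation suggested above as a reason for the larger FoF group count.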

Table 5 shows the results of a comparison with this catalogue.
The Control 'purity' for this group catalogue has
increased by ~9% relative to the halo catalogue, the measured
purity for peak catalogues has increased by ~3-7%,
the average number of galaxies in each matched group has
decreased by ~20-40%, and the average number of groups
matched has increased by ~20-35%. This result is most
likely caused by the FoF algorithm fragmenting real groups
into multiple FoF groups. The result of this would be a
higher density of groups with fewer members per group,
consistent with the observed results. The lower number of field
galaxies matched on average, however, is not explained by
groups being fragmented. This is possibly a result of field
galaxies being grouped into spurious pairs by the FoF
algorithm. Both of these hypotheses are consistent with the
observation, as can be seen in Fig. 10 below, that the FoF
catalogue contains more poor groups than the halo catalogue
and fewer rich groups.

3.4 zCOSMOS Galaxy Catalogues 

In addition to simulated galaxy catalogues, we tested
the P3 algorithm on galaxy catalogues from the
COSMOS/CFHTLS D2 field. This field has spectroscopic redshifts
for a large number of the galaxies, in addition to photometric
redshifts based on $u^*griz$ photometry from Ilbert
et al. (2006) with errors of $\sigma_z \sim 0.05$ for $i' < 22.5$, and also
much smaller errors from the COSMOS 30-band photometry
(Ilbert et al. 2009; $\sigma_z \sim 0.02$ for $i' < 22.5$). This allows us
to better see what purity we can expect when we run
the P3 algorithm on the CFHTLS-Wide survey, and how the
accuracy might improve in surveys with better photometric
redshift accuracy.

Table 5. Summary of the purity and completeness of the P3 algorithm when its results are matched to a group catalogue obtained
through a Friends-of-Friends algorithm applied to galaxies from the Millennium simulation using their given redshifts, containing 39101
total FoF groups, for various minimum S/N limits and a Control. The combined galaxy catalogues cover a total of 12 deg$^2$ of simulated
sky. Columns are as Table 1.

CFHTLSpz errors
Cut        R_ap  N_hit  N_peak  P     C     <N_m>  <m>   <N_mat>  <N_field>
Control    0.5   1960   2988    0.66  N/A   3.68   0.76  2.19     4.02
All Peaks  0.5   2643   3439    0.77  0.17  4.31   1.30  2.65     4.85
S/N > 2    0.5   1920   2406    0.80  0.13  4.77   1.40  2.84     4.98
S/N > 3    0.5   1238   1496    0.83  0.09  5.39   1.55  3.07     5.17
S/N > 4    0.5   632    725     0.87  0.05  6.86   1.78  3.32     5.36
Control    0.25  1960   2988    0.66  N/A   3.68   0.76  2.19     4.02
All Peaks  0.25  13791  20727   0.67  0.65  3.68   0.98  2.15     4.05
S/N > 2    0.25  8637   10904   0.79  0.44  4.28   1.12  2.44     4.40
S/N > 3    0.25  3152   3537    0.89  0.18  6.23   1.35  3.00     4.83
S/N > 4    0.25  906    986     0.92  0.06  10.18  1.68  3.54     5.22

COSMOS30pz errors
Cut        R_ap  N_hit  N_peak  P     C     <N_m>  <m>   <N_mat>  <N_field>
Control    0.5   1960   2988    0.66  N/A   3.68   0.76  2.19     4.02
All Peaks  0.5   2549   2805    0.91  0.21  4.42   1.58  3.06     5.76
S/N > 2    0.5   1951   2048    0.95  0.17  4.93   1.80  3.40     6.07
S/N > 3    0.5   1017   1025    0.99  0.10  6.53   2.31  4.17     6.47
S/N > 4    0.5   413    415     1.00  0.05  9.35   3.64  4.80     6.71
Control    0.25  1960   2988    0.66  N/A   3.68   0.76  2.19     4.02
All Peaks  0.25  13256  18724   0.71  0.67  3.72   0.99  2.20     4.12
S/N > 2    0.25  6662   7160    0.93  0.39  4.84   1.36  2.75     4.75
S/N > 3    0.25  1688   1715    0.98  0.13  8.76   2.25  3.72     5.39
S/N > 4    0.25  444    448     0.99  0.04  15.81  3.02  4.04     5.79

We have used the spectroscopic group catalogue by
Knobel et al. (2009) to assess the accuracy of our method.
The catalogue contains 604 groups at redshifts between
$z_{\rm lo} = 0.2$ and $z_{\rm hi} = 0.8$ over 1.7 deg$^2$, which is only ~13% as
many groups as were detected in the same area with the
simulated galaxy catalogues. The discrepancy is not accounted
for by the completeness of this catalogue, claimed to be 85%
by Knobel et al. for sampled galaxies, with a 70% sampling
rate (Lilly et al. 2007). The discrepancy is illustrated
in Fig. 10, which shows Knobel et al.'s group counts
for varying richness, corrected for completeness, sampling
rate, and the differing galaxy density of their D2 field from
the Millennium fields, compared to our group catalogues.
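The size of this discrepancy can be reproduced with a short calculation. The sketch below assumes that the simulated comparison sample is the halo catalogue of 31668 groups over 12 deg² quoted in the caption of Table 4 (this choice reproduces the ~13% figure), and it applies the quoted completeness and sampling rate as simple multiplicative corrections, which is a simplification made for illustration.

```python
def surface_density(n_groups, area_deg2):
    """Number of groups per square degree."""
    return n_groups / area_deg2


# Values quoted in the text.
zcosmos_density = surface_density(604, 1.7)    # zCOSMOS group catalogue
halo_density = surface_density(31668, 12.0)    # simulated halo catalogue (assumed reference)

raw_ratio = zcosmos_density / halo_density

# Correct for the quoted 85% completeness and 70% sampling rate.
corrected_ratio = raw_ratio / (0.85 * 0.70)

if __name__ == "__main__":
    print(f"zCOSMOS: {zcosmos_density:.0f} groups/deg^2")
    print(f"halo catalogue: {halo_density:.0f} groups/deg^2")
    print(f"raw ratio: {raw_ratio:.1%}, corrected ratio: {corrected_ratio:.1%}")
```

Even after this correction the zCOSMOS density remains well below the simulated one, consistent with the statement that completeness and sampling alone do not account for the difference.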

The discrepancy is better explained by the lower completeness
of Knobel et al.'s group-finding method for poor
groups. As can be seen from Fig. 2 of Knobel et al. (2009),
for groups with 10 or fewer members, which have a typical
mass of $10^{13}\,h^{-1}\,M_\odot$, the completeness is less than 40%.
Additionally, Knobel et al. noted that the D2 field appears to
have an unusually low density of rich groups, which explains
the cut-off seen in Fig. 10.

In comparing the photometric redshifts to spectroscopic
redshifts, we found that there appeared to be a small, but
significant, offset between the photometric redshifts
provided by Ilbert et al. (2006) and the spectroscopic
redshifts. To correct for this, we performed a sky match on
galaxies present in both of the catalogues and fit a linear
correction function to their redshifts, of the form
$z_{\rm real} = 0.957\,z_{\rm phot} + 0.00843$. This correction allowed us to
properly match our detected groups to those from Knobel et al.
Although this correction had a significant effect on the
apparent quality of our group-matching, it is unlikely to have
any significant effect on lensing measurements.
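A linear correction of this form can be obtained with an ordinary least-squares fit to the matched galaxies. The sketch below assumes an unweighted least-squares fit, which is one plausible choice and not necessarily the fitting procedure used here; the function names are illustrative, and the demonstration uses synthetic redshifts generated from the quoted coefficients rather than real catalogue data.

```python
def fit_linear(z_phot, z_spec):
    """Unweighted least-squares fit of z_spec = slope * z_phot + intercept."""
    n = len(z_phot)
    mean_x = sum(z_phot) / n
    mean_y = sum(z_spec) / n
    sxx = sum((x - mean_x) ** 2 for x in z_phot)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(z_phot, z_spec))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def correct_photoz(z_phot, slope=0.957, intercept=0.00843):
    """Apply the linear correction quoted in the text to a photometric redshift."""
    return slope * z_phot + intercept


if __name__ == "__main__":
    # Synthetic, noise-free "matched" sample built from the quoted relation.
    z_phot = [0.2, 0.4, 0.6, 0.8]
    z_spec = [correct_photoz(z) for z in z_phot]
    print(fit_linear(z_phot, z_spec))
```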

Fig. 8 and Fig. 9 show a graphical representation of the
S/N in selected redshift slices for the D2 field. Although the 
catalogue of FoF-identified groups is significantly sparser, 
most P3 groups do correspond to a FoF-identified group. 

As can be seen in Table 6, P3's purity using real photometric
redshifts and spectroscopically-identified groups is
somewhat worse than its purity from using simulated galaxy 



Figure 10. A comparison of the number of groups of varying numbers
of members identified in the magnitude range $i' < 22.5$ for
both the Millennium simulation, generated through halo catalogues
(magenta dotted) and through a friends-of-friends algorithm
(red dashed), and for the zCOSMOS FoF groups of Knobel
et al. (2009) (blue solid). The horizontal axis shows the number of
members. We can assess completeness by measuring
how far the zCOSMOS plot lies to the left of the Millennium
plot. This gives us an estimate of 20-40% completeness among
the groups detected.



catalogues with CFHTLSpz errors, as would be expected
from the inaccuracies inherent in real-world data. Additionally,
the group catalogue suffers from incompleteness due
to the inaccuracy of the group-finding algorithm and the
incomplete sampling of galaxies in the field. As mentioned
previously, the completeness of this group catalogue for groups
of mass less than $10^{14}\,M_\odot$ is ~40%. These completeness
effects are also evident in the smaller average number of
members in our detected groups, compared to the averages
for the simulated catalogues. They are also evident in the
fact that the Control catalogue shows lower 'purity' for this
field than for the Millennium fields. Fig. 10 compares the
group counts for various memberships of the D2 and Millennium
fields, directly illustrating the incompleteness of the
D2 galaxy and group catalogues.

Figure 8. The calculated S/N for the $\delta$ of galaxies on a grid of points in R.A. and Dec, sliced at different values of the redshift,
for the D2 field, using $R_{\rm ap} = 0.5\,h^{-1}$ Mpc. S/N is indicated by the colour. Locations of real groups detected through Knobel et al.'s
friends-of-friends algorithm are indicated by white circles, with their sizes indicating the sizes of the groups. White crosses indicate the
location of a circle in a nearby layer, within our threshold redshift for being considered a match. Detected peaks with a S/N of more than
2 are indicated by the black diamonds. Peaks are detected in three dimensions, so what appear to be peaks in the individual plots may
actually be detected on another slice. Additionally, peaks have a threshold radius within which they must be the highest point to count
as a peak, so some peaks may not be detected if they are sufficiently close to another peak. Left column shows data derived from the
Ilbert et al. (2006) photo-zs, with errors similar to the CFHTLSpz errors used previously, and the right column shows data derived from
the COSMOS-30 photo-zs, with errors similar to the COSMOS30pz errors used previously. Redshift slices, from top to bottom: 0.58, 0.60,
0.62. The COSMOS plot shows the interesting effect that many galaxies are individually resolvable, as their redshift errors are smaller
than the width of the slices. These galaxies appear as solid circles in the plot.

Figure 9. An alternate view of the plots from Fig. 8 using a slice at constant Dec, showing how far groups extend in the redshift
dimension. The left plot was constructed with Ilbert et al. (2006)'s photo-zs and $R_{\rm ap} = 0.5\,h^{-1}$ Mpc; the right was constructed with
the COSMOS-30 photo-zs and $R_{\rm ap} = 0.5\,h^{-1}$ Mpc. Note the differing scale in the comparison, as the COSMOS-30 catalogue covers a
somewhat larger area. Here, we can clearly see in the COSMOS-30 plot the contraction of groups in the redshift dimension.

Table 6. Summary of the purity and completeness of the P3 algorithm applied to the galaxies in the D2 field. The combined galaxy
catalogues cover approximately 1 deg$^2$ of sky. Columns are as Table 1. Masses were calculated through an abundance matching technique
by Knobel et al.

0.28   N/A   3.42
0.00

We also attempted to assess how accurately we will be
able to determine the size of a group from its local $\delta$ using
these catalogues, illustrated in Fig. 11. Although the
group catalogue is significantly sparser than the catalogues
extracted from the Millennium simulation, a positive correlation
between $\delta$ and the number of members in a group can
still be seen. However, the trend is not significant enough to
allow us to make future estimates of the richness of groups
from their $\delta$.



4 PRELIMINARY APPLICATION TO THE CFHTLS-WIDE

The CFHTLS-Wide survey is a 170 deg$^2$ survey over four
patches of sky, taken by the Canada-France-Hawaii Telescope.
Photometric redshifts for the CFHTLS-Wide were
prepared with the methods described in Erben et al. (2009)
and Hildebrandt et al. (2009) in the framework of the
CFHTLenS collaboration. The catalogues will be described
in detail in a forthcoming paper (Hildebrandt et al., in
preparation). The photo-z's are based on the publicly available
BPZ code (Benítez 2000), yielding an accuracy of $\sigma_z \sim 0.03$
for $i' < 22.5$. Lensing-quality shear measurements for galaxies
within the survey are currently in preparation.

We used the following parameters for our preliminary 
run of P3 on the currently-available data from the Wide 
survey: 

• $R_{\rm ap} = 0.25\,h^{-1}$ Mpc. There was a significant benefit to
the smaller aperture size when P3 was applied to simulated
galaxy catalogues. This was not the case with real catalogues,
but that may be due to a low sampling rate.

• $i' < 22.5$. Tests of P3 which included galaxies in the
range $22.5 < i' < 24$ showed a small decrease in purity for
comparable numbers of groups detected.

• S/N > 3. This limit typically provided the best balance 
of purity and number of groups detected. 

With these parameters, we detected a total of 18813 



14 Bryan R. Gillis et al. 



Figure 11. The overdensity $\delta$ calculated by the P3 algorithm at the location of every FoF-identified group (small dots) versus the
number of members in those groups for the D2 field. Also shown are the $\delta$ of all estimated groups and the number of bright ($i' < 22.5$)
members of the FoF-identified group they match to (large dots with error bars). The numbers of members for all data points have a
random component of less than 1 included in order to aid viewing. Left plot shows simulations with photometry from Ilbert et al. (2006),
right plot shows simulations with photometry from the COSMOS-30 survey. The horizontal axis of each plot shows the number of members.

Figure 12. Number of groups (peaks with S/N > 3) detected over redshift. Left plot uses $R_{\rm ap} = 0.5\,h^{-1}$ Mpc, right plot uses $R_{\rm ap} =
0.25\,h^{-1}$ Mpc. Vertical axis has been scaled to give groups per $(h^{-1}\,{\rm Gpc})^3$. We appear to become more incomplete with redshift, much
more notably with the smaller aperture size. This is due to the apparent magnitude limit of the catalogue reducing the number of galaxies
detected for higher-redshift groups, which decreases the chance that a given group will be detected by P3.



groups over the 78 fields available. The fields have an
average unmasked area of ~0.95 deg$^2$, giving an average of
241 groups/deg$^2$. We expect approximately 80% of these
detected groups to correspond to real groups. This purity is
comparable to the similar work done by Adami et al. (2010),
but the detected group density is much higher than their
19.2 clusters/deg$^2$.
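The quoted group densities follow from simple arithmetic on the numbers above. The sketch below uses only values given in the text; the variable names are ours. Note that 18813 groups over 78 fields is ~241 groups per field, and that dividing by the average unmasked area of 0.95 deg² would give a somewhat higher surface density; the ~41000 groups predicted in the conclusion for the full 170 deg² survey corresponds to 241 × 170.

```python
N_GROUPS = 18813          # groups detected with S/N > 3
N_FIELDS = 78             # CFHTLS-Wide fields available
MEAN_AREA_DEG2 = 0.95     # average unmasked area per field
SURVEY_AREA_DEG2 = 170    # full CFHTLS-Wide area

groups_per_field = N_GROUPS / N_FIELDS
groups_per_deg2 = groups_per_field / MEAN_AREA_DEG2

# Extrapolation to the full survey using the rounded figure quoted in the text.
predicted_total = round(groups_per_field) * SURVEY_AREA_DEG2

if __name__ == "__main__":
    print(f"{groups_per_field:.1f} groups per field")
    print(f"{groups_per_deg2:.1f} groups per unmasked deg^2")
    print(f"{predicted_total} groups predicted over {SURVEY_AREA_DEG2} deg^2")
```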

The distribution of groups as a function of redshift is
shown in Fig. 12. P3's group count gets more incomplete
with increasing redshift, and this effect is more pronounced
when using the smaller aperture size. The likely cause of this
is that, at low redshifts, we can detect groups with just a
few members. If one of these groups were moved to a higher
redshift, fewer members would be detected, and the group
itself would fall below the threshold for detection. The larger
aperture size detects only richer groups in the first place, and
these groups are more likely to have enough members that
would still be visible were the group at a higher redshift.

Table 7. Summary of the purity and completeness of the P3 algorithm
applied to the available CFHTLS-Wide fields with $R_{\rm ap}$
= 0.5 $h^{-1}$ Mpc, matched to the cluster catalogue provided by Lu et al. Our
catalogue shows an above-random correlation to Lu et al.'s catalogue,
with approximately 31% purity and 69% completeness for
$R_{\rm ap} = 0.25\,h^{-1}$ Mpc and a S/N cut of 3. Columns are as Table 1,
with $\langle N_m \rangle$ referring to the average number of red sequence
galaxies in detected and matched groups.

0.45

Cut      R_ap  N_hit  N_peak  P     C     <N_m>
S/N > 3  0.25  2524           0.31  0.69  7.11
S/N > 4  0.25  2014   4284    0.47  0.45  8.15

We compared the results of P3 to Lu et al. (2009)'s
cluster catalogue for the Wide fields, as shown in Table 7,
finding a purity of 31%. Although this purity is low compared
to our previous results, this is to be expected. As Lu
et al.'s catalogue was derived through searching for galaxies
on the red sequence, it consists primarily of clusters of 10
or more large red galaxies, while our catalogue contains a
large number of poorer groups that we would expect to not
match to anything in Lu et al.'s catalogue, resulting in a
decreased apparent purity. The important point is that our
purity is significantly above the 11% attained with our Control
catalogue. Our completeness here is 69%, which raises
the question of why, if Lu et al.'s groups are some of the
richest visible, the remainder weren't detected by our algorithm. An
inspection of S/N maps generated from the Wide catalogues
showed that the groups P3 didn't pick up were typically
missed for one of the following reasons:

• The groups were part of a structure which had a large
extent in the sky, and P3 picked a different peak from Lu
et al.'s centre.

• Lu et al.'s group was near the edge of the field or the
redshift range. P3 will not report a peak detected at the
edge of the field or at the redshift limits, as it is impossible
to know if this is the actual peak, or the real peak lies outside
the analyzed range.

• Lu et al.'s group lay near a heavily-masked region,
where a high S/N is less likely.
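The edge rule in the list above, together with the requirement that a peak be the highest point within a threshold radius, can be sketched as a local-maximum search on the three-dimensional S/N grid. This is a simplified illustration and not the P3 implementation: it assumes a cubic neighbourhood measured in grid cells (`radius`, a hypothetical parameter) and it skips every cell whose neighbourhood would extend past the grid boundary, which is a stricter version of the edge rule.

```python
def find_peaks(snr, min_snr, radius=1):
    """Return (ix, iy, iz, value) for strict local maxima of a 3D S/N grid.

    snr is a nested list indexed as snr[ix][iy][iz], with the third axis
    standing for redshift. Cells closer than `radius` cells to any edge are
    never reported, since the true maximum could lie outside the grid.
    """
    nx, ny, nz = len(snr), len(snr[0]), len(snr[0][0])
    peaks = []
    for ix in range(radius, nx - radius):
        for iy in range(radius, ny - radius):
            for iz in range(radius, nz - radius):
                value = snr[ix][iy][iz]
                if value < min_snr:
                    continue
                # Every other cell in the cubic neighbourhood of this cell.
                neighbours = (
                    snr[jx][jy][jz]
                    for jx in range(ix - radius, ix + radius + 1)
                    for jy in range(iy - radius, iy + radius + 1)
                    for jz in range(iz - radius, iz + radius + 1)
                    if (jx, jy, jz) != (ix, iy, iz)
                )
                if all(value > other for other in neighbours):
                    peaks.append((ix, iy, iz, value))
    return peaks


if __name__ == "__main__":
    grid = [[[0.0] * 5 for _ in range(5)] for _ in range(5)]
    grid[2][2][2] = 4.0   # interior peak above the S/N cut
    grid[0][2][2] = 9.0   # higher value, but on the edge of the grid
    print(find_peaks(grid, min_snr=3.0))
```

In the usage example, the high value on the boundary is not reported, while the interior maximum is; a second, higher cell placed next to the interior maximum would suppress it, as described in the caption of Fig. 8.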



5 CONCLUSION 

In this work we developed and tested a method for identifying
a very pure sample of galaxy groups using photometric
redshifts. We predict that the method will result in a purity
of ~84% for the quality of photometry present in the
CFHTLS-Wide survey. Running our algorithm on the available
fields, we detected an average of 241 groups per
square-degree field in the redshift range 0.2 < z < 0.8, which will
give a predicted 41000 groups once photometry from the entire
170 square degree survey is available. From simulated
data, we estimate that our groups have an average membership
of ~9 bright ($i' < 22.5$) galaxies.

Our group-finding method shows a limited ability to
estimate the size of detected groups through their local $\delta$.
There is a positive correlation between the number of members
in a group and its $\delta$, although the significantly larger
number of smaller groups makes a direct estimate of the
number of members impractical.

Our detected distribution of groups over redshift is con- 
sistent with past results (?). Our group catalogue of the 
presently-available CFHTLS-Wide fields is consistent with 



Lu et al. (2009)'s catalogue, to the extent that would be
expected given the differences in our methods.
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